Improving statistical natural concept generation in interlingua-based speech-to-speech translation
Authors
Abstract
Natural concept generation is critical to the performance of statistical interlingua-based speech translation. To improve maximum-entropy-based concept generation, a set of novel features and algorithms is proposed, including features that enable model training on parallel corpora, the use of confidence thresholds, and multiple sets of features. The concept generation error rate is reduced by 43%-50% on our limited-domain speech translation corpus. Improvements are also achieved in our speech-to-speech translation experiments.
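The abstract does not say how the confidence thresholds are applied. As a rough illustration only, the sketch below shows one common way to pair a maximum-entropy (softmax) classifier with a confidence threshold: the top-scoring label is accepted only when its probability reaches the threshold, and a fallback is returned otherwise. The function names, the `(feature, label)` weight layout, and the fallback behaviour are all assumptions made for this example and are not taken from the paper.

```python
import math


def maxent_probs(weights, features, labels):
    """Return p(label | features) under a maximum-entropy model.

    `weights` maps (feature, label) pairs to real-valued weights
    (a hypothetical layout chosen for this sketch). Missing pairs
    contribute a weight of 0.
    """
    scores = {
        label: sum(weights.get((feat, label), 0.0) for feat in features)
        for label in labels
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {label: math.exp(score - top) for label, score in scores.items()}
    norm = sum(exps.values())
    return {label: value / norm for label, value in exps.items()}


def predict_with_threshold(weights, features, labels, threshold, fallback):
    """Return the most probable label, or `fallback` if it is not confident enough.

    The fallback policy is an assumption of this sketch, not the paper's method.
    """
    probs = maxent_probs(weights, features, labels)
    best = max(probs, key=probs.get)
    return best if probs[best] >= threshold else fallback
```

A caller would pick `threshold` on held-out data; a higher value trades coverage for fewer confident mistakes.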
Similar resources
Use of maximum entropy in natural word generation for statistical concept-based speech-to-speech translation
Our statistical concept-based spoken language translation method consists of three cascaded components: natural language understanding, natural concept generation and natural word generation. In the previous approaches, statistical models are used only in the first two components. In this paper, a novel maximum-entropy-based statistical natural word generation algorithm is proposed that takes i...
Sistema de Traducción Oral para el Castellano, Catalán e Inglés
This paper describes the FAME Interlingua-based Speech-to-Speech Translation System for Catalan, English and Spanish. It extends the existing NESPOLE! system, which translates between English, French, German and Italian, but all modules have now been integrated into an Open Agent Architecture. This article describes the system architecture and the interlingua formalism used, called Int...
Approach to interchange-format based Chinese generation
Interlingua-based machine translation is an important approach to implementing multilingual speech-to-speech (S2S) translation. Natural language generation (NLG) is one of the key components of interlingua-based machine translation systems. This paper introduces our approach to Chinese generation based on the Interchange Format (IF) developed by the C-STAR organization. In our approach, t...
Korean Language Generation in an Interlingua-based Speech Translation System
Group 24 at MIT Lincoln Laboratory has been developing an automatic speech-to-speech translation system for the English-Korean pair. For the machine translation module, an interlingua system has been adopted. This system analyzes the source language text and represents the results of the analysis in a semantic frame, an unambiguous textual-meaning propositional representation language, from whic...
Developing Non-European Translation Pairs in a Medium-Vocabulary Medical Speech Translation System
We describe recent work on MedSLT, a medium-vocabulary interlingua-based medical speech translation system, focussing on issues that arise when handling languages of which the grammar engineer has little or no knowledge. We describe how we can systematically create and maintain multiple forms of grammars, lexica and interlingual representations, with some versions being used by language informa...
Publication date: 2003